library(tidyverse)
library(gapminder)
library(here)
library(socviz)
library(ggrepel)
rel_by_region <- gss_sm %>%
  group_by(bigregion, religion) %>%
  summarise(N = n()) %>%
  mutate(freq = N / sum(N),
         pct = round(freq * 100, 0))
Figure 5.2: Religious preferences by region
Figure 5.3: Religious preferences by region, faceted version
## # A tibble: 10 x 6
## country year donors pop pop_dens gdp
## <chr> <date> <dbl> <int> <dbl> <int>
## 1 Denmark NA NA NA NA NA
## 2 Australia 1994-01-01 10.2 17855 0.231 19849
## 3 France NA NA 56709 10.3 18162
## 4 Germany 1998-01-01 13.4 82047 23.0 23283
## 5 Sweden 1998-01-01 14.6 8851 1.97 23525
## 6 Netherlands 2001-01-01 11.6 16046 38.6 28756
## 7 Finland 1996-01-01 19.5 5125 1.52 19842
## 8 Switzerland 1999-01-01 14.4 7144 17.3 28562
## 9 Spain NA NA 38850 7.68 12971
## 10 Italy NA NA NA NA NA
Figure 5.4: Not informative scatterplot
Figure 5.5: A faceted lineplot
Figure 5.6: A first attempt at boxplots by country
Figure 5.7: Moving countries to the y-axis
Figure 5.8: Boxplots reordered by median donation rate
Figure 5.9: A boxplot with the fill aesthetic mapped
Figure 5.10: Using points instead of a boxplot
Figure 5.12.1: A jittered plot
Figure 5.12.2: A jittered plot with width parameters changed
Figure 5.13: A Cleveland dotplot, with colored points
Figure 5.14: A faceted dotplot with free scales on the y-axis
Figure 5.15: A dot-and-whisker plot, with the range defined by the standard deviation of the measured variable.
Figure 5.16: Plotting labels and text
p <- ggplot(data = by_country,
            mapping = aes(x = roads_mean, y = donors_mean))
p + geom_point() +
  geom_text(aes(label = country), hjust = 0)
Using different values of hjust to adjust the labels is not a robust approach, because the space is added in proportion to the length of the label. As a result, longer labels move further away from their points than you want.
The ggrepel package provides more robust geoms for adding labels and text.
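As a minimal sketch of the ggrepel approach, the example below labels points with geom_text_repel() instead of geom_text(). The data frame `toy`, its column names, and its values are invented for illustration and stand in for a summary table such as by_country.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical toy data (invented values), with labels of very different lengths
toy <- data.frame(
  label = c("Short", "A much longer label", "Mid-size", "Tiny"),
  x = c(1, 2, 3, 4),
  y = c(10, 12, 11, 15)
)

p <- ggplot(data = toy,
            mapping = aes(x = x, y = y))

# geom_text_repel() positions each label so that it avoids the points
# and the other labels, regardless of how long the label is
p_repel <- p + geom_point() +
  geom_text_repel(mapping = aes(label = label))

p_repel
```

Because the label positions are computed by the geom, no per-label hjust tuning is needed.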
Figure 5.18: Text labels with ggrepel
Figure 5.19: Top: Labeling text according to a single criterion. Bottom: Labeling according to several criteria.
p <- ggplot(data = by_country,
            mapping = aes(x = gdp_mean, y = health_mean))
p + geom_point() +
  geom_text_repel(data = subset(by_country, gdp_mean > 25000),
                  mapping = aes(label = country))
p <- ggplot(data = by_country,
            mapping = aes(x = gdp_mean, y = health_mean))
p + geom_point() +
  geom_text_repel(data = subset(by_country,
                                gdp_mean > 25000 | health_mean < 1500 |
                                  country %in% "Belgium"),
                  mapping = aes(label = country))
Figure 5.20: Labeling using a dummy variable
organdata <- organdata %>%
  mutate(ind = case_when(
    ccode %in% c("Ita", "Spa") & year > 1998 ~ TRUE,
    TRUE ~ FALSE
  ))
p <- ggplot(organdata,
            mapping = aes(x = roads,
                          y = donors,
                          color = ind))
p + geom_point() +
  geom_text_repel(data = subset(organdata, ind),
                  mapping = aes(label = ccode)) +
  # "none" replaces FALSE, which recent ggplot2 versions deprecate in guides()
  guides(label = "none", color = "none")
Figure 5.21: Arbitrary text with annotate()
Figure 5.22: Using two different geoms with annotate()
Each geom_ function takes mappings tailored to the kind of graph it draws.
Graphs have other features not strictly connected to the logical structure of the data being displayed: background color, typeface, or legend placement. These are controlled using theme().
To adjust how a mapped variable is displayed, use a scale_ function. To adjust a legend, use guides(). To adjust features not tied to the data, use theme().
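The division of labor between scale_ functions, guides(), and theme() can be sketched as follows. This is an illustration only: it uses the mpg dataset that ships with ggplot2 rather than the data from these notes, and the label and title strings are arbitrary choices.

```r
library(ggplot2)

# mpg ships with ggplot2; it is only a stand-in dataset here
p <- ggplot(data = mpg,
            mapping = aes(x = displ, y = hwy, color = drv))

p_adj <- p + geom_point() +
  # scale_: change how a mapped variable is displayed
  scale_x_log10() +
  scale_color_discrete(labels = c("4-wheel", "Front", "Rear")) +
  # guides(): change the legend itself
  guides(color = guide_legend(title = "Drive train")) +
  # theme(): change features not tied to the data
  theme(legend.position = "top")

p_adj
```

Each adjustment sits in its own layer of the call, so any one of them can be dropped without affecting the others.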
Figure 5.23: Every mapped variable has a scale
Figure 5.25: Making some scale adjustments
Figure 5.26: Relabeling via a scale function